Abstract

We show how coupling of local optimization processes can lead to better solutions
than multi-start local optimization consisting of independent runs. This is achieved
by minimizing the average energy cost of the ensemble, subject to synchronization
constraints between the state vectors of the individual local minimizers. From an
augmented Lagrangian which incorporates the synchronization constraints both as
soft and hard constraints, a network is derived wherein the local minimizers interact
and exchange information through the synchronization constraints. From the viewpoint
of neural networks, the array can be considered as a Lagrange programming
network for continuous optimization and as a cellular neural network (CNN). The
penalty weights associated with the soft state synchronization constraints follow from
the solution to a linear program, which expresses that the energy cost of the ensemble
should maximally decrease. In this way successful local minimizers can implicitly
impose their state on the others through a mechanism of master-slave dynamics,
resulting in a cooperative search mechanism. Improved information spreading within
the ensemble is obtained by applying the concept of small-world networks. We
illustrate the new optimization method on two different problems: supervised learning
of multilayer perceptrons and optimization of Lennard-Jones clusters. The initial
distribution of the local minimizers plays an important role. For the training of
multilayer perceptrons this is related to the choice of the prior on the interconnection
weights in Bayesian learning methods. Depending on the choice of this initial
distribution, coupled local minimizers (CLM) can avoid overfitting and produce good
generalization, i.e. reach a state of intelligence. In potential energy surface
optimization of Lennard-Jones clusters, this choice is equally important. In this case it
can be related to considering a confining potential. This work suggests, in an
interdisciplinary context, the importance of information exchange and state synchronization
within ensembles for issues such as evolution, collective behaviour, optimality and
intelligence.

1 Introduction



A large variety of problems arising in engineering, physics, chemistry and economics can
be formulated as optimization problems, either constrained or unconstrained, having
continuous or discrete variables. A well-developed area in optimization theory is that of
local optimization methods, including conjugate gradient, quasi-Newton, Levenberg-Marquardt,
sequential quadratic programming etc. [Bertsekas, 1996; Fletcher, 1987; Gill et al., 1981].
Continuous-time optimization methods have been developed within the area of neural
networks [Cichocki & Unbehauen, 1994]. Popular methods for global exploration of the search
space are e.g. simulated annealing [Kirkpatrick et al., 1983] and genetic algorithms
[Goldberg, 1989]. Within standard local optimization techniques one often applies multi-start
local optimization, by trying different starting points, running the processes independently
of each other and selecting the best result from all trials. On the other hand,
for training of neural networks such as multilayer perceptrons (MLP) it is well-known that,
instead of training several MLPs for random choices of small initial weights and selecting
the best of all trained networks, it is better to form a committee network which is based upon
all trained networks [Arbib, 1995; Bishop, 1995]. In other words, the training efforts of the
less optimal networks are not entirely useless but can be employed in order to improve the
estimate in view of the bias-variance trade-off [Bishop, 1995]. Unfortunately, committee
networks are only applicable to static nonlinear regression and classification problems.
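
As a point of reference, multi-start local optimization as described above can be sketched in a few lines. The fragment below is only an illustration under stated assumptions: it uses plain gradient descent with a fixed step on the one-dimensional double-well function of Fig. 1, and the function names, step size and starting points are hypothetical choices made for this sketch.

```python
def U(x):
    """Double-well cost function of Fig. 1: x^4 - 16 x^2 + 5 x + 100."""
    return x**4 - 16 * x**2 + 5 * x + 100


def dU(x):
    """Derivative of U."""
    return 4 * x**3 - 32 * x + 5


def local_minimize(x0, step=1e-3, iters=20000):
    """Plain gradient descent from x0 (fixed step, illustrative only)."""
    x = x0
    for _ in range(iters):
        x -= step * dU(x)
    return x


def multi_start(starts):
    """Run independent local searches and keep the best end point."""
    results = [local_minimize(x0) for x0 in starts]
    return min(results, key=U), results


best, results = multi_start([-4.0, -1.0, 1.0, 4.0])
print("end points:", results)
print("best:", best, "U(best):", U(best))
```

The runs do not interact: those started in the right-hand valley stay there, and their effort contributes nothing to the final answer beyond being discarded.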

In this paper we propose a new methodology of coupled local minimizers (CLM) for
solving continuous nonlinear optimization problems. We pose a challenge somewhat similar
to that of committee networks, but within a different and broader context of solving
differentiable optimization problems. The aim is to combine (on-line) the results from
local optimizers in order to let the ensemble generate a local minimum that is better than
the best result obtained from all individual local minimizers. We show how improved local
minima can be obtained by having interaction and information exchange between the local
search processes. This is realized through state synchronization constraints that are
imposed between the local minimizers by incorporating principles of master-slave dynamics.
Synchronization theory has been intensively studied within the area of chaotic systems and
secure communications [Chen & Dong, 1998; Pecora & Carroll, 1990; Suykens et al., 1996,
1997, 1998; Wu & Chua, 1994]. The CLM method is related to Lagrange programming
network approaches for chaos synchronization [Suykens & Vandewalle, 2000], where
identical or generalized synchronization constraints are imposed on dynamical systems. CLMs
also fit within the framework of Cellular Neural Networks (CNN) [Chua & Roska, 1993;
Chua et al., 1995; Chua, 1998]. By considering the objective of minimizing the average
cost of an ensemble of local minimizers subject to pairwise synchronization constraints,
a continuous-time optimization algorithm is studied according to Lagrange programming
networks [Cichocki & Unbehauen, 1994; Zhang & Constantinides, 1992]. The resulting
continuous-time optimization algorithm is described by an array of coupled nonlinear cells
or a one-dimensional CNN with bi-directional coupling.

We show how to obtain intelligence from CLMs. This is done by considering a problem
of static nonlinear regression with MLPs from given noisy measurement data. In this case
CLMs correspond to coupled backpropagation processes [Rumelhart et al., 1986; Werbos,
1990]. In order to obtain an MLP with good generalization performance (i.e. an intelligent
solution) it is usually necessary to consider a regularization term in addition to the original
sum squared error cost function (fitting error) which is defined on the training data [Bishop,
1995; MacKay, 1992; Poggio & Shelton, 1999; Suykens & Vandewalle, 1998]. When
applying CLMs such a regularization term is not needed. It turns out that the
distribution of the initial states of the several local minimizers can play this role, such
that one only has to optimize the training set error. This is consistent with insights
from Bayesian learning of neural networks where the regularization term is related to the
choice of the prior on the unknown interconnection weights, which expresses that the initial
weights should be chosen small (weight decay).



The role of the initial CLM state is not only important for neural networks but also with
respect to potential energy surface optimization of Lennard-Jones clusters, a second class of
problems that is investigated in this paper. It is considered to be an important benchmark
problem in the area of protein folding [Sali et al., 1994; Neumaier, 1997; Wales & Scheraga,
1999]. We show that CLMs are able to detect the global minimum of (LJ)$_{38}$, which possesses
a double-funnel energy landscape and is known to be a challenging test-case [Wales &
Doye, 1997; Wales et al., 1998; Wales & Scheraga, 1999]. The CLM method has also
been successfully applied to larger scale problems. The cooperative search mechanism in
CLMs is obtained by solving a linear programming problem in the unknown soft constraint
penalty factors. In this way a maximal decrease in the average energy cost of the ensemble
is achieved.

This paper is organized as follows. In Section 2 we introduce coupled local minimizers. 
In Section 3 we discuss how to obtain optimal interaction between local minimizers. In 
Section 4 we present new insights on coupled backpropagation processes and intelligence. In 
Section 5 we apply coupled local minimizers to the optimization of Lennard-Jones clusters. 

2 A CNN of Coupled Local Minimizers 

Consider the minimization of a twice continuously differentiable cost function $U(x)$ with
$x \in \mathbb{R}^n$. Let us take an ensemble consisting of $q$ local minimizers. We aim at
minimizing the average energy cost $\langle U \rangle = \frac{1}{q} \sum_{i=1}^{q} U[x^{(i)}]$
of this ensemble subject to pairwise state synchronization of the particles:

$$\min_{x^{(i)} \in \mathbb{R}^n} \langle U \rangle \quad \text{such that} \quad x^{(i)} - x^{(i+1)} = 0, \quad i = 1, 2, \ldots, q \tag{1}$$

with particle states $x^{(i)} \in \mathbb{R}^n$ for $i = 1, 2, \ldots, q$ and boundary conditions
$x^{(0)} = x^{(q)}$, $x^{(q+1)} = x^{(1)}$. The synchronization constraints have to be achieved in
an asymptotic sense, i.e. the particles have to reach the same final state. One defines the
augmented Lagrangian

$$\mathcal{L}(x^{(i)}, \lambda^{(i)}) = \frac{\eta}{q} \sum_{i=1}^{q} U[x^{(i)}] + \frac{1}{2} \sum_{i=1}^{q} \gamma_i \| x^{(i)} - x^{(i+1)} \|_2^2 + \sum_{i=1}^{q} \langle \lambda^{(i)}, x^{(i)} - x^{(i+1)} \rangle \tag{2}$$

with Lagrange multipliers $\lambda^{(i)} \in \mathbb{R}^n$ ($i = 1, 2, \ldots, q$). The different terms in
this Lagrangian are the objective function, the soft constraints and the hard constraints
related to the pairwise state synchronization constraints. The penalty factors $\gamma_i$
emphasize the importance of each of the soft synchronization constraints. From this augmented
Lagrangian one obtains the Lagrange programming network [Zhang & Constantinides, 1992]

$$\dot{x}^{(i)} = -\nabla_{x^{(i)}} \mathcal{L}, \qquad \dot{\lambda}^{(i)} = \nabla_{\lambda^{(i)}} \mathcal{L}, \qquad i = 1, 2, \ldots, q. \tag{3}$$

This is basically a continuous-time optimization algorithm for solving the given constrained
optimization problem. One obtains the following array of coupled local minimizers (CLM)

$$\begin{aligned} \dot{x}^{(i)} &= -\frac{\eta}{q} \nabla_{x^{(i)}} U[x^{(i)}] + \gamma_{i-1} [x^{(i-1)} - x^{(i)}] - \gamma_i [x^{(i)} - x^{(i+1)}] + \lambda^{(i-1)} - \lambda^{(i)} \\ \dot{\lambda}^{(i)} &= x^{(i)} - x^{(i+1)}, \qquad i = 1, 2, \ldots, q \end{aligned} \tag{4}$$

with learning rate $\eta$. This array consists of $q$ coupled nonlinear cells and is
considered to be a special case of cellular neural networks (CNN) [Chua, 1998].

The basic mechanism of state synchronization with coupled local minimizers is
illustrated in Fig. 1 for a double potential well problem with two particles that are searching
for the global minimum. The main idea here is to impose that they should reach the same
final position. A simplified CLM considered in this case is

$$\begin{cases} \dot{x} = -\nabla_x U(x) - (x - z) - \lambda \\ \dot{z} = -\nabla_z U(z) + (x - z) + \lambda \\ \dot{\lambda} = x - z \end{cases} \tag{5}$$

where $x, z, \lambda \in \mathbb{R}$. This is derived from the augmented Lagrangian

$$\mathcal{L}(x, z, \lambda) = U(x) + U(z) + \frac{1}{2} (x - z)^2 + \lambda (x - z). \tag{6}$$

As a result of the constraining of the system, the two particles are forced to decide
which valley to choose. When the initial states of the two particles are located in
different valleys, one observes that they always reach the global minimum. This
phenomenon is independent of the steepness and shape of the valleys. In this process,
the search space is in fact duplicated and the local optimizers exchange information
through the synchronization constraint.
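
The two-particle mechanism can be tried out numerically. The following is a minimal sketch, assuming the double-well function of Fig. 1, a forward-Euler integrator (the simulations reported in this paper use an adaptive Runge-Kutta rule instead) and hypothetical values for the step, the horizon and the initial states. The dynamics are those given in the caption of Fig. 1: dx/dt = -U'(x) - (x - z) - lambda, dz/dt = -U'(z) + (x - z) + lambda, dlambda/dt = x - z.

```python
def dU(x):
    """Derivative of U(x) = x^4 - 16 x^2 + 5 x + 100 (Fig. 1)."""
    return 4 * x**3 - 32 * x + 5


def simulate_clm(x, z, lam=0.0, dt=2e-3, t_end=200.0):
    """Forward-Euler integration of the two-particle CLM.

    x, z : initial particle positions
    lam  : initial co-state (Lagrange multiplier)
    """
    for _ in range(int(t_end / dt)):
        dx = -dU(x) - (x - z) - lam
        dz = -dU(z) + (x - z) + lam
        dlam = x - z
        x, z, lam = x + dt * dx, z + dt * dz, lam + dt * dlam
    return x, z, lam


# One particle starts in each valley (illustrative initial states).
x, z, lam = simulate_clm(x=-2.9, z=2.7)
print("x =", x, "z =", z, "lambda =", lam)
```

In this run the co-state builds up a force that pulls the two particles together; the shallower valley gives way first, after which both particles settle in the deeper one and the co-state relaxes back towards zero.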

Note that it is not necessarily guaranteed beforehand that the equality constraints in
(1) will be exactly realized by the CLM, as is known for Lagrange programming networks.
In this case one could consider as CLM goal that the state synchronization constraints
should be realized in an approximate sense, i.e. bringing the states close to each other.
Eventually, the equality constraints could be replaced by a set of inequality constraints
which would implement this.
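
For a general ensemble, one integration step of the coupled array and a measure of how closely the synchronization constraints are met can be sketched as follows. This is an illustration under assumptions: forward Euler is used for brevity, the ring indexing follows the boundary conditions $x^{(0)} = x^{(q)}$, $x^{(q+1)} = x^{(1)}$ (applied here to the co-states and penalty factors as well), and the quadratic toy cost, the helper names `clm_step` and `sync_error` and all numerical values are hypothetical.

```python
def clm_step(xs, lams, grad, gammas, eta, dt):
    """One forward-Euler step of the CLM array on a ring of q cells.

    xs, lams : lists of q state / co-state vectors (lists of floats)
    grad     : callable returning the gradient of U at a state vector
    gammas   : list of q penalty factors gamma_i
    eta      : learning rate
    """
    q, n = len(xs), len(xs[0])
    new_xs, new_lams = [], []
    for i in range(q):
        prev, nxt = (i - 1) % q, (i + 1) % q  # ring boundary conditions
        g = grad(xs[i])
        dx = [
            -(eta / q) * g[k]
            + gammas[prev] * (xs[prev][k] - xs[i][k])
            - gammas[i] * (xs[i][k] - xs[nxt][k])
            + lams[prev][k] - lams[i][k]
            for k in range(n)
        ]
        dlam = [xs[i][k] - xs[nxt][k] for k in range(n)]
        new_xs.append([xs[i][k] + dt * dx[k] for k in range(n)])
        new_lams.append([lams[i][k] + dt * dlam[k] for k in range(n)])
    return new_xs, new_lams


def sync_error(xs):
    """Largest componentwise gap between neighbouring cells on the ring."""
    q = len(xs)
    return max(abs(a - b)
               for i in range(q)
               for a, b in zip(xs[i], xs[(i + 1) % q]))


# Toy run with U(x) = 0.5 * ||x||^2, so that grad U(x) = x (illustration only).
xs = [[3.0, -1.0], [-2.0, 0.5], [1.0, 2.0], [0.0, -3.0]]
lams = [[0.0, 0.0] for _ in xs]
for _ in range(4000):
    xs, lams = clm_step(xs, lams, grad=lambda x: x,
                        gammas=[1.0] * 4, eta=1.0, dt=0.01)
print("synchronization error:", sync_error(xs))
```

Monitoring `sync_error` during a run is one simple way to check whether the constraints are being realized exactly or only approximately.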



3 CLMs and Optimal Cooperative Search

In this Section, we discuss how to obtain an optimal interaction between the several local
minimizers. This is done by making a suitable choice of the penalty weights $\gamma_i$. In this
design procedure we aim at maximally decreasing the instantaneous average energy cost
of the ensemble

$$\frac{d\langle U \rangle}{dt} = \frac{1}{q} \sum_{i=1}^{q} \Big\langle \nabla_{x^{(i)}} U[x^{(i)}], \; -\frac{\eta}{q} \nabla_{x^{(i)}} U[x^{(i)}] + \gamma_{i-1} [x^{(i-1)} - x^{(i)}] - \gamma_i [x^{(i)} - x^{(i+1)}] + \lambda^{(i-1)} - \lambda^{(i)} \Big\rangle. \tag{7}$$

For a given state vector $x = [x^{(1)}; x^{(2)}; \ldots; x^{(q)}]$ and costate
$\Lambda = [\lambda^{(1)}; \lambda^{(2)}; \ldots; \lambda^{(q)}]$ of the ensemble, the latter
expression is affine in $\gamma = [\gamma_1; \gamma_2; \ldots; \gamma_q]$ and the interaction
between the local minimizers is then optimized by solving the linear program (LP):

$$\min_{\gamma \in \mathbb{R}^q} \left. \frac{d\langle U \rangle}{dt} \right|_{x, \Lambda} \quad \text{such that} \quad \underline{\gamma} \le \gamma_i \le \overline{\gamma}, \quad i = 1, 2, \ldots, q \tag{8}$$

where $\underline{\gamma}$, $\overline{\gamma}$ are user-defined lower and upper bounds. The
resulting $\gamma_i$ values are kept constant for simulation of the CLM over a certain time
interval $\Delta T$. The differences among the $\gamma_i$ values cause an effect similar to
that known in master-slave (drive-response) synchronization, which has been studied for
secure communications using chaos [Chen & Dong, 1998; Pecora & Carroll, 1990; Suykens
et al., 1997, 1998; Wu & Chua, 1994]. In this way, successfully performing local optimizers
impose their state on the other part of the ensemble, which leads to cooperative behaviour.
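
Because the rate of change of the average cost is affine in the penalty weights and the only constraints are bounds, the LP separates per weight: each $\gamma_i$ goes to its upper bound when its coefficient is negative and to its lower bound otherwise. The sketch below illustrates this; it is not the solver used for the reported experiments. It assumes the coefficient of $\gamma_i$ is $\frac{1}{q} \langle \nabla U[x^{(i+1)}] - \nabla U[x^{(i)}], x^{(i)} - x^{(i+1)} \rangle$, which follows from collecting the two terms in which $\gamma_i$ appears; the function names and the numerical example are hypothetical.

```python
def gamma_coefficients(xs, grads):
    """Coefficient c_i of gamma_i in d<U>/dt = const + sum_i c_i * gamma_i."""
    q = len(xs)
    coeffs = []
    for i in range(q):
        nxt = (i + 1) % q  # ring boundary condition x^(q+1) = x^(1)
        c = sum((g_next - g_i) * (x_i - x_next)
                for g_i, g_next, x_i, x_next
                in zip(grads[i], grads[nxt], xs[i], xs[nxt]))
        coeffs.append(c / q)
    return coeffs


def solve_box_lp(coeffs, lo, hi):
    """Minimise sum_i c_i * gamma_i subject to lo <= gamma_i <= hi."""
    return [hi if c < 0 else lo for c in coeffs]


def dU(x):
    """Derivative of the double-well function of Fig. 1."""
    return 4 * x**3 - 32 * x + 5


# Three one-dimensional particles (illustrative positions).
xs = [[-3.2], [-2.5], [2.5]]
grads = [[dU(x[0])] for x in xs]
coeffs = gamma_coefficients(xs, grads)
gammas = solve_box_lp(coeffs, lo=1.0, hi=10.0)
print("coefficients:", coeffs)
print("gammas:", gammas)
```

In this example the two particles in the same valley are coupled strongly, while the link across the barrier receives the weakest admissible coupling; a general-purpose LP solver is only needed if further constraints on $\gamma$ are added.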

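The step size $\eta$ is determined here by imposing a target evolution law

$$\frac{d\langle U \rangle}{dt} = -\alpha \, (\langle U \rangle - U^*) \tag{9}$$

where $\alpha$ is a user-defined positive real constant and $U^*$ is an estimate of the energy cost
value at the global minimum (alternatively one adds a constant value to the cost function
such that the energy cost is guaranteed to be positive everywhere). As a result the step size
$\eta$ becomes

$$\eta = \frac{q \sum_{i=1}^{q} \langle \nabla_{x^{(i)}} U[x^{(i)}], h^{(i)} \rangle + q \alpha \left( \sum_{i=1}^{q} U[x^{(i)}] - q U^* \right)}{\sum_{i=1}^{q} \langle \nabla_{x^{(i)}} U[x^{(i)}], \nabla_{x^{(i)}} U[x^{(i)}] \rangle}$$

subject to additional user-defined constraints $\underline{\eta} \le \eta \le \overline{\eta}$, where
$h^{(i)} = \gamma_{i-1} [x^{(i-1)} - x^{(i)}] - \gamma_i [x^{(i)} - x^{(i+1)}] + \lambda^{(i-1)} - \lambda^{(i)}$
by definition. The values of $\gamma_i$, $\eta$ are re-scheduled each time in between
simulations of the array over user-defined time intervals $\Delta T$.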
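
The step size rule can be checked numerically. The sketch below assumes the target law $d\langle U \rangle / dt = -\alpha (\langle U \rangle - U^*)$ and the cell dynamics $\dot{x}^{(i)} = -\frac{\eta}{q} \nabla U[x^{(i)}] + h^{(i)}$; the function names and the small data set are hypothetical, and a non-zero gradient somewhere in the ensemble is assumed so that the denominator does not vanish.

```python
def step_size(U_vals, grads, hs, alpha, U_star, eta_lo, eta_hi):
    """Step size eta that enforces the target evolution law, clipped to bounds.

    U_vals : cost values U[x^(i)] of the q cells
    grads  : gradients of U at the q states
    hs     : coupling terms h^(i) of the q cells
    """
    q = len(U_vals)
    gh = sum(g_k * h_k for g, h in zip(grads, hs) for g_k, h_k in zip(g, h))
    gg = sum(g_k * g_k for g in grads for g_k in g)  # assumed non-zero
    eta = (q * gh + q * alpha * (sum(U_vals) - q * U_star)) / gg
    return min(max(eta, eta_lo), eta_hi)


def average_cost_rate(grads, hs, eta):
    """d<U>/dt = (1/q) sum_i <grad U_i, -(eta/q) grad U_i + h_i>."""
    q = len(grads)
    return sum(g_k * (-(eta / q) * g_k + h_k)
               for g, h in zip(grads, hs)
               for g_k, h_k in zip(g, h)) / q


# Two cells in two dimensions (illustrative numbers).
U_vals = [3.0, 5.0]
grads = [[1.0, 2.0], [-1.0, 0.5]]
hs = [[0.2, -0.1], [0.0, 0.3]]
eta = step_size(U_vals, grads, hs, alpha=2.0, U_star=0.0,
                eta_lo=1e-6, eta_hi=1e6)
print("eta:", eta)
print("d<U>/dt:", average_cost_rate(grads, hs, eta))
```

When the bounds are not active, the resulting rate equals $-\alpha (\langle U \rangle - U^*)$; when $\eta$ is clipped, the target law is only approximated.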

According to the idea of small-world networks [Watts & Strogatz, 1998], the information
spreading within the ensemble can be further improved, e.g. by re-numbering a part of the
ensemble (state vectors together with their corresponding co-state) at random every $c \Delta T$
steps where $c \in \mathbb{N}_0$. In this way the pairwise synchronization constraints vary among the
different local minimizers during optimization.
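
One possible reading of this re-numbering step is sketched below: a randomly chosen fraction of the cells exchange positions on the ring, each state vector moving together with its co-state. The function name `renumber`, the default fraction and the rule of moving at least two cells are assumptions made for this illustration.

```python
import random


def renumber(xs, lams, fraction=0.2, rng=random):
    """Randomly permute the ring positions of a fraction of the cells.

    Each state vector keeps its own co-state; only the numbering changes,
    so the neighbours seen through the synchronization constraints change.
    """
    q = len(xs)
    k = max(2, round(fraction * q))      # at least two cells are needed
    chosen = rng.sample(range(q), k)     # positions to be re-numbered
    shuffled = chosen[:]
    rng.shuffle(shuffled)
    new_xs, new_lams = xs[:], lams[:]
    for src, dst in zip(chosen, shuffled):
        new_xs[dst], new_lams[dst] = xs[src], lams[src]
    return new_xs, new_lams


xs = [[float(i)] for i in range(20)]
lams = [[10.0 + i] for i in range(20)]
new_xs, new_lams = renumber(xs, lams, fraction=0.2, rng=random.Random(0))
print([x[0] for x in new_xs])
```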

In order to obtain insight into the influence of the CLM tuning parameters let us consider
the optimization of the cost function

$$U(x) = \frac{1}{2} a \sum_{i=1}^{n} x_i^2 + 8n - 4n \prod_{i=1}^{n} \cos(\omega_1 x_i) - 4n \prod_{i=1}^{n} \cos(\omega_2 x_i) \tag{10}$$

with $a = 0.01$, $\omega_1 = 0.2$, $\omega_2 = 1$ in an $n = 10$ dimensional search space [Styblinski &
Tang, 1990]. The global minimum for this problem is known and located at the origin with
$U(0) = 0$, and $U^* = 0$ is taken. The solutions from CLMs improve on those from multi-start
local optimization, even when the latter uses a quasi-Newton method instead of the steepest
descent optimizers employed in the CLM. In these comparisons the initial states have been
generated uniformly at random in $[-20, 20]^n$, where the same initial states have been taken
for the different algorithms and several runs of the algorithms have been done. CLMs with
$q > 20$ reach the global minimum in this case. From this example it follows that,
in order to optimize a surface over a certain region in search space, the number $q$ needed
to achieve a good performance depends on the complexity of the shape of the surface, or typically
on the number of local minima per volume in search space that one intends to explore.
Experiments suggest that a factor $\overline{\gamma} / \underline{\gamma} = 10$ is usually a good choice. In this way the energy
levels (cost function values) of the optimizers remain bundled. Otherwise the simulations could
lead to excessive exploration and decreased speed of convergence. Typically, larger values
of $\gamma$ will require shorter $\Delta T$ intervals. The choice of the pressure coefficient $\alpha$ will control
the average energy decrease of the ensemble. Demanding a high pressure $\alpha$, however,
might lead to less exploration and worse local minima. The bounds on $\eta$ are taken here
as $\underline{\eta} = 10^{-[\ldots]}$, $\overline{\eta} = 10^{[\ldots]}$. Although it is not necessary, the CLM performance can be further
enhanced by applying the small-world network concept, by re-numbering at random (here
20% of the local minimizers) every $c \Delta T$ steps where $c = 5$ typically, leading to improved
information spreading among the optimizers. For the simulations here and in the sequel a
Runge-Kutta integration rule with adaptive step size (absolute and relative error equal to
$10^{-[\ldots]}$) has been used.

4 Intelligence and Coupled Backpropagation

In this Section we apply CLMs to the training of multilayer perceptron networks

$$y = w^T \tanh(V u + \beta) \tag{11}$$

with input vector $u \in \mathbb{R}^m$, output $y \in \mathbb{R}$, interconnection weights $w \in \mathbb{R}^{n_h}$, $V \in \mathbb{R}^{n_h \times m}$
and bias vector $\beta \in \mathbb{R}^{n_h}$, where $n_h$ denotes the number of hidden units. Let us denote
the unknown parameter vector as $\theta = [w; V(:); \beta]$, which stacks all the weights into one
single vector. Since the introduction of the backpropagation algorithm for training of
neural networks [Rumelhart et al., 1986; Werbos, 1990], important insights have been
obtained about regularization (weight decay) of the cost function in order to have a good
generalization ability in view of the bias-variance trade-off [Bishop, 1995; MacKay, 1992;
Suykens & Vandewalle, 1998]. It is well-known that instead of optimizing the cost function

$$\min_{\theta} J(\theta) = \frac{1}{2} \sum_{k=1}^{N} [d_k - y_k(\theta)]^2 \tag{12}$$

where $d_k$ are the desired target outputs (with given data set $\{u_k, d_k\}_{k=1}^{N}$ of $N$ training
data), it is better to consider the following cost function with regularization

$$\min_{\theta} J_{\mathrm{reg}}(\theta; \mu, \zeta) = \zeta \, \frac{1}{2} \sum_{k=1}^{N} [d_k - y_k(\theta)]^2 + \mu \, \frac{1}{2} \theta^T \theta \tag{13}$$

where $\mu$, $\zeta$ are additional hyperparameters which can be automatically tuned within the
framework of Bayesian learning. The second term is related to a prior distribution on the
weights which typically expresses that the initial weights are chosen small. The first term
is associated with the likelihood. When networks are overparametrized (too many hidden
units), regularization makes it possible to work implicitly with a number of effective parameters
that is less than the actual number of interconnection weights, such that overfitting can be
avoided.
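
The network and the two cost functions can be written down compactly. The sketch below is illustrative only: it assumes the factor 1/2 in front of both the sum squared error and the weight decay term, represents $V$ as a list of rows, and uses hypothetical function names and data.

```python
import math


def mlp(u, w, V, beta):
    """y = w^T tanh(V u + beta) for a single input vector u."""
    hidden = [math.tanh(sum(v_jk * u_k for v_jk, u_k in zip(V_j, u)) + b_j)
              for V_j, b_j in zip(V, beta)]
    return sum(w_j * h_j for w_j, h_j in zip(w, hidden))


def cost(params, data):
    """Sum squared error: 0.5 * sum_k (d_k - y_k)^2 (factor 0.5 assumed)."""
    w, V, beta = params
    return 0.5 * sum((d - mlp(u, w, V, beta)) ** 2 for u, d in data)


def cost_reg(params, data, mu, zeta):
    """Regularized cost: zeta * (fitting error) + mu * 0.5 * theta^T theta."""
    w, V, beta = params
    theta = list(w) + [v for row in V for v in row] + list(beta)
    return zeta * cost(params, data) + mu * 0.5 * sum(t * t for t in theta)


# Tiny example: two hidden units, one input, one training pair.
params = ([1.0, -1.0], [[0.0], [0.0]], [0.0, 0.0])
data = [([1.0], 2.0)]
print(cost(params, data), cost_reg(params, data, mu=1.0, zeta=3.0))
```

Setting `mu = 0` and `zeta = 1` recovers the unregularized cost, which is the function that the CLM optimizes in this Section.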

Let us now consider a CLM by taking as state vectors $x^{(i)}$ ($i = 1, \ldots, q$) the unknown
parameter vectors $\theta^{(i)}$. Hence, the CLM consists in fact of coupled backpropagation processes
with weight vector synchronization. In Fig. 2 an example of estimating a sine function from
given noisy data is given where an overparametrized MLP is taken. Simulation results show
that with CLMs one is able to obtain a smooth solution similar to that from (13) with Bayesian
learning, but by optimizing the original cost function (12) without regularization. In order
to obtain this intelligent solution (i.e. a good generalization performance), the initial
distribution of $x^{(i)}(0)$ ($i = 1, \ldots, q$) (at time 0) plays an important role, similar to the choice
of $\mu$ in (13). Consider the following distribution of the initial CLM state

$$p(x(0)) \propto \exp\left[ -\frac{1}{2\sigma^2} x(0)^T x(0) \right] \tag{14}$$

which assumes that the states of the local minimizers $x^{(i)}(0)$ are normally distributed. The
value of $\sigma$ then plays the role of a bifurcation hyperparameter. For a very large $\sigma$
a solution with bad generalization is obtained, while for small $\sigma$ (small initial weights)
a good generalization performance is obtained. Hence, CLMs can avoid the overfitting
phenomenon without the use of regularization in the sum squared error cost function. In
order to obtain the results with good generalization in Fig. 2, the following CLM tuning
parameters were taken: $q = 20$, $\sigma = 0.1$, $\alpha = 10^{[\ldots]}$, $\underline{\gamma} = 10^{[\ldots]}$, $\overline{\gamma} = 10^{[\ldots]}$, $\underline{\eta} = 10^{[\ldots]}$, $\overline{\eta} = 10^{[\ldots]}$,
$U^* = 0$ and 20% of the local minimizers are re-numbered after $c = 5$ with $\Delta T = 10^{-[\ldots]}$.
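
Drawing the initial ensemble from this zero-mean normal distribution amounts to sampling every component independently with standard deviation $\sigma$. A minimal sketch, with a hypothetical function name and an ensemble size chosen only for illustration:

```python
import random


def sample_initial_states(q, n, sigma, rng=random):
    """Draw q initial state vectors with i.i.d. N(0, sigma^2) components."""
    return [[rng.gauss(0.0, sigma) for _ in range(n)] for _ in range(q)]


# Example: 200 states of dimension 30 (the MLP of Fig. 2 has 30 parameters).
states = sample_initial_states(q=200, n=30, sigma=0.1, rng=random.Random(1))
flat = [v for x in states for v in x]
mean = sum(flat) / len(flat)
std = (sum((v - mean) ** 2 for v in flat) / len(flat)) ** 0.5
print("empirical mean:", mean, "empirical std:", std)
```

A small $\sigma$ keeps all initial weight vectors close to the origin, which is how the initial distribution takes over the role of the weight decay term.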



5 Optimization of Lennard-Jones Clusters 

In this Section we discuss the application of CLMs to the optimization of Lennard-Jones
clusters. In predicting the three-dimensional structure of proteins from amino acid
sequences, potential energy surface (PES) minimization is often related to the native
structure of the protein. Optimization of Lennard-Jones (LJ) clusters is considered to be an
important benchmark problem in this respect [Sali et al., 1994; Neumaier, 1997; Wales &
Scheraga, 1999]. Recently, detailed studies have been reported about optimization of (LJ)$_N$
with $N < 150$ by means of basin-hopping techniques in comparison with many other
approaches [Wales & Doye, 1997; Wales et al., 1998; Wales & Scheraga, 1999]. The cost
function in this case is given by

$$U_{LJ} = 4 \sum_{i<j} \left[ \left( \frac{1}{r_{ij}} \right)^{12} - \left( \frac{1}{r_{ij}} \right)^{6} \right] \tag{15}$$

where $r_{ij}$ is the Euclidean distance between atoms $i$ and $j$ ($i, j = 1, \ldots, N$). We consider
(LJ)$_{38}$, which possesses a double-funnel energy landscape and is known to be an interesting test-case.
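
For reference, the Lennard-Jones energy in reduced units and its gradient with respect to the atom positions can be evaluated as follows. This is a plain illustrative sketch with hypothetical function names, not the cmex implementation used for the reported experiments; the prefactor 4 and the unit length and energy scales are assumed, consistent with the quoted global minimum values.

```python
def lj_energy(pos):
    """U_LJ = 4 * sum_{i<j} (r_ij^-12 - r_ij^-6) for a list of 3-D positions."""
    total = 0.0
    for i in range(len(pos)):
        for j in range(i + 1, len(pos)):
            r2 = sum((a - b) ** 2 for a, b in zip(pos[i], pos[j]))
            inv6 = 1.0 / r2 ** 3
            total += 4.0 * (inv6 * inv6 - inv6)
    return total


def lj_gradient(pos):
    """Gradient of U_LJ with respect to every atom position."""
    grad = [[0.0, 0.0, 0.0] for _ in pos]
    for i in range(len(pos)):
        for j in range(i + 1, len(pos)):
            diff = [a - b for a, b in zip(pos[i], pos[j])]
            r2 = sum(d * d for d in diff)
            inv6 = 1.0 / r2 ** 3
            # (dU/dr) / r = 4 * (-12 r^-14 + 6 r^-8)
            coef = 4.0 * (-12.0 * inv6 * inv6 + 6.0 * inv6) / r2
            for k in range(3):
                grad[i][k] += coef * diff[k]
                grad[j][k] -= coef * diff[k]
    return grad


# Dimer at the equilibrium distance r = 2^(1/6): energy -1, zero gradient.
r0 = 2.0 ** (1.0 / 6.0)
dimer = [[0.0, 0.0, 0.0], [r0, 0.0, 0.0]]
print(lj_energy(dimer), lj_gradient(dimer))
```

In a CLM, each state vector $x^{(i)}$ would stack the $3N$ coordinates of one candidate cluster, and `lj_gradient` would supply the gradient term of the corresponding cell.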

Fig. 3 shows the result of a CLM run applied to (LJ)$_{38}$. The CLM tuning parameters
in this case are $q = 50$, $\alpha = 10^{[\ldots]}$, $\underline{\gamma} = 10^{[\ldots]}$, $\overline{\gamma} = 10^{[\ldots]}$, $\underline{\eta} = 10^{-[\ldots]}$, $\overline{\eta} = 10^{[\ldots]}$, $U^* = 0$
($\tilde{U}_{LJ} = U_{LJ} + \delta$ with $\delta = 200$ is optimized in order to ensure a positive cost function
everywhere) and 10% of the local minimizers are re-numbered after $c = 5$ with $\Delta T = 10^{-[\ldots]}$.
The CLM energy bundle caused by state synchronization of the local minimizers is clearly
visible. As in the application of CLMs to the training of neural networks, the choice of
$\sigma$ in $p(x(0))$ plays an important role. In Fig. 3 the differences are illustrated for $\sigma = 0.1$
versus $\sigma = 1$. The choice of this prior implicitly corresponds in fact to an additional
confining potential energy term, which has been explicitly considered in effective potential
minimization methods [Schelstraete & Verschelde, 1997] (in a different form). However,
the effective potential minimization method has problems with detecting e.g. the global
minima of (LJ)$_{8,9}$, which on the other hand are easily found by CLMs. In the example of
(LJ)$_{38}$ a classical local optimization technique (quasi-Newton and scaled conjugate gradient
for large scale problems) has been used for post-processing and fine-tuning (about 10-20
additional iterations) of the CLM results. The variety of generated solutions includes the
global minimum -173.9284.

In the application of CLMs to larger clusters, first a shifted version of the problem
$U_{LJ}^{\mathrm{shift}} = [\ldots]$ has been minimized in order to avoid ill-conditioning
in the LP problem and excessively large values in the initial CLM evolution. For (LJ)$_{150}$,
$\mu = 0.1$, $\nu = 3$ has been chosen. These results have then been taken as initialization for
optimization of $U_{LJ}$ and are, eventually, post-processed by a classical local optimization
method. The CLM tuning parameters in the (LJ)$_{150}$ case are $q = 100$, $\sigma = 0.1$, $\alpha = 10^5$,
$U^* = 0$, $\tilde{U}_{LJ}^{\mathrm{shift}} = U_{LJ}^{\mathrm{shift}} + \delta$ with $\delta = 3000$ and $\tilde{U}_{LJ} = U_{LJ} + \delta$ with $\delta = 1000$.
The other tuning parameters for (LJ)$_{150}$ are the same as for the (LJ)$_{38}$ example. This
CLM run yields a PES value of -872.91, while the best known minimum is -893.31 in this
case, obtained by counting nearest-neighbour interactions for icosahedral packing schemes
(Northby). However, as with simulated annealing and basin-hopping approaches, our focus
here is on unbiased methods which are generally applicable.

One can argue that CLMs are providing good solutions given the user-defined initial 
distribution (prior). In this sense, CLM results can potentially be further improved by 
exploiting more informative prior knowledge about the problem. The simulation results 
show that this becomes more important for larger clusters. Simulations have been done on 
a Sun Ultra 2 workstation in Matlab with cmex implementation of the PES evaluation and 
gradient calculations. CLM running times are roughly comparable with $q$ applications of
a standard gradient-based local optimization method (conjugate gradient).

6 Conclusions 

We have introduced a new optimization method of coupled local minimizers. This is
done by considering the average energy cost of the total ensemble subject to pairwise
synchronization constraints that are asymptotically reached. The resulting continuous-time
optimization algorithm from Lagrange programming networks is an array of coupled
nonlinear cells or CNN. For problems of MLP training in supervised learning, we have shown
that CLMs are able to produce intelligent solutions, in the sense that one can obtain a
good generalization ability without taking a regularization term in the cost function. The
obtained insights are consistent with Bayesian learning methods for MLPs. We have also
shown how one can let the ensemble of local minimizers cooperate in an optimal way. This
is done by solving additional linear programming problems. The CLM method has been
successfully applied to the optimization of Lennard-Jones clusters, which is an important
benchmark problem in the area of protein folding. We expect that the proposed CLM
methodology may offer new insights for the application of continuous-time optimization
algorithms to NP-complete problems in general. It also emphasizes the importance of the
role played by the initial state distribution of the local minimizers that form the CLM
ensemble.

Captions of Figures



Fig. 1 Illustration of the basic state synchronization mechanism in coupled local minimizers
(CLM): (A) double well cost function $U(x) = x^4 - 16x^2 + 5x + 100$ with global minimum
located at $x \simeq -2.90$; (B) CLM with $\dot{x} = -\nabla U(x) - (x - z) - \lambda$, $\dot{z} = -\nabla U(z) + (x - z) + \lambda$,
$\dot{\lambda} = x - z$, where $x, z \in \mathbb{R}$ (red and blue, respectively) are the two particle states with
costate $\lambda \in \mathbb{R}$ (green). The CLM corresponds to the Lagrange programming network with
cost function $U(x) + U(z)$ subject to $x = z$. Except when both initial states $x(0)$, $z(0)$ are
positive, the global minimum is always reached by this CLM. Similar results are obtained
for other double well problems with broad and narrow minima.

Fig. 2 Supervised learning of multilayer perceptrons on a benchmark problem of training
a sinusoidal function (green) given 20 training data points $\{u_k, d_k\}$ (blue circles) corrupted
with Gaussian noise (zero mean, standard dev. 0.4) using a multilayer perceptron with one
hidden layer consisting of 10 hidden units. Given the fact that the MLP consists of 30
unknown parameters, overfitting will occur with standard training methods, unless
regularization or early stopping is applied. In (A) results from standard neural network training
methods are shown: scaled conjugate gradient without early stopping leading to overfitting
(red); best result by Bayesian learning with regularization of the cost function and
automatic hyperparameter selection, leading to good generalization (blue). (B) shows that
a CLM which optimizes a sum squared error on the training data without regularization
of the cost function can produce a result (blue) which is comparable to the quality of the
Bayesian learning solution in (A), provided the $\sigma$ value in $p(x(0)) \propto \exp(-\frac{1}{2\sigma^2} x(0)^T x(0))$
is well chosen (here $\sigma = 0.1$). (C) shows a result from a CLM for a bad choice of $\sigma = 5$.
The large variance among the resulting local minimizers of the CLM is clearly visible, in
comparison with (B) where the variance is small with good generalization. In (D) the
CLM evolution with $q = 20$ of the sum squared error cost function is shown during
optimization, related to the results of (B).



Fig. 3 CLM application to potential energy surface minimization of Lennard-Jones clusters:
(A) CLM evolution of $U_{LJ}$ for $q = 50$ on (LJ)$_{38}$; (B) Importance of the choice of
the initial distribution: $\sigma = 1$ (red), $\sigma = 0.1$ (blue) with post-processing by means of a
quasi-Newton method resulting in $\sigma = 1$ (magenta), $\sigma = 0.1$ (green), respectively. Shown
are the mean (dashed line), min and max (solid lines) over 7 CLM runs after sorting the
energy values. The variety of solutions includes the global minimum -173.9284, visualized
in (C). In (D) a CLM result with $q = 100$ is shown for a larger cluster (LJ)$_{150}$.